
Collaborating Authors

 Hamilton


Appendix. Yang Bo, Department of Computing and Software, McMaster University

Neural Information Processing Systems

As τ is learned from the attention branch ... From Appendix A.1, we obtain the gradient of the sample-wise l ... Source code for the experiments is available in the zip file. All experiments are implemented in PyTorch and run on a single Nvidia A100 GPU. For CIFAR-10 and CIFAR-100, we do not perform early stopping, since we do not assume the presence of clean validation data. All test accuracies are recorded from the last epoch of training.


Noise Attention Learning: Enhancing Noise Robustness by Gradient Scaling. Yang Bo, Department of Computing and Software, McMaster University

Neural Information Processing Systems

Machine learning has been highly successful in data-driven applications but is often hampered when the data contains noise, especially label noise. When trained on noisy labels, deep neural networks tend to fit all noisy labels, resulting in poor generalization. To handle this problem, a common idea is to force the model to fit only clean samples rather than mislabeled ones. In this paper, we propose a simple yet effective method that automatically distinguishes the mislabeled samples and prevents the model from memorizing them, named Noise Attention Learning. In our method, we introduce an attention branch to produce attention weights based on representations of samples.
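The abstract does not state the loss function, so the following is only a minimal sketch of the general idea of gradient scaling, under an assumed form: a per-sample weight τ in [0, 1] multiplies the cross-entropy loss. The function names and the fixed value of τ are hypothetical; in the paper τ is produced by a learned attention branch, which is not modeled here.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_ce_and_grad(logits, label, tau):
    """Assumed illustrative loss: tau * cross-entropy.

    Returns the loss and its gradient with respect to the logits.
    For cross-entropy the gradient is (softmax - one_hot), so the
    weighted loss has gradient tau * (softmax - one_hot).
    """
    probs = softmax(logits)
    loss = -tau * math.log(probs[label])
    grad = [tau * (p - (1.0 if k == label else 0.0))
            for k, p in enumerate(probs)]
    return loss, grad

# Same sample, two hypothetical attention weights.
logits, label = [2.0, 0.5, -1.0], 1
_, grad_trusted = weighted_ce_and_grad(logits, label, tau=1.0)
_, grad_suspect = weighted_ce_and_grad(logits, label, tau=0.1)
```

With a small τ the sample contributes a proportionally smaller gradient, so a sample the model treats as likely mislabeled has less influence on the update.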



Human Digital Twins in Personalized Healthcare: An Overview and Future Perspectives

arXiv.org Artificial Intelligence

This evolution indicates an expansion from industrial uses into diverse fields, including healthcare [61], [59]. The core functionalities of digital twins include an accurate mirroring of their physical counterparts, capturing all associated processes in a data-driven manner, maintaining a continuous connection that synchronizes with the real-time state of their physical twins, and simulating physical behavior for predictive analysis [85]. In the context of healthcare, a novel extension of this technology manifests in the form of Human Digital Twins (HDTs), designed to provide a comprehensive digital mirror of individual patients. HDTs not only represent physical attributes but also integrate dynamic changes across molecular, physiological, and behavioral dimensions. This advancement is aligned with a shift toward personalized healthcare (PH) paradigms, enabling tailored treatment strategies based on a patient's unique health profile, thereby enhancing preventive, diagnostic, and therapeutic processes in clinical settings [44], [50]. The personalization aspect of HDTs underscores their potential to revolutionize healthcare by facilitating precise and individualized treatment plans that optimize patient outcomes [72]. Although the potential of digital twins in healthcare has garnered much attention, practical applications are still emerging, with critical literature highlighting that many implementations remain in exploratory stages [59]. Notably, institutions like the IEEE Computer Society and Gartner recognize this technology as a pivotal component in the ongoing evolution of healthcare systems that emphasize both precision and personalization [31], [89].


Strassen Multisystolic Array Hardware Architectures

arXiv.org Artificial Intelligence

While Strassen's matrix multiplication algorithm reduces the complexity of naive matrix multiplication, general-purpose hardware is not suitable for achieving the algorithm's promised theoretical speedups. This leaves open the question of whether it could be better exploited in custom hardware architectures designed specifically for executing the algorithm. However, there is limited prior work on this, and it is not immediately clear how to derive such architectures or whether they can ultimately lead to real improvements. We bridge this gap, presenting and evaluating new systolic array architectures that efficiently translate the theoretical complexity reductions of Strassen's algorithm directly into hardware resource savings. Furthermore, the architectures are multisystolic array designs that can multiply smaller matrices with higher utilization than single-systolic array designs. The proposed designs implemented on FPGA reduce DSP requirements by a factor of $1.14^r$ for $r$ implemented Strassen recursion levels, and otherwise require overall similar soft logic resources when instantiated to support matrix sizes down to 32x32 and 24x24 at 1-2 levels of Strassen recursion, respectively. We evaluate the proposed designs both in isolation and in an end-to-end machine learning accelerator compared to baseline designs and prior works, achieving state-of-the-art performance.
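As a software illustration only (not the hardware architecture described above), the sketch below applies the standard Strassen recursion to square matrices and counts scalar multiplications. Each recursion level replaces 8 block products with 7, a ratio of 8/7 ≈ 1.14 per level, which is presumably the origin of the $1.14^r$ factor in the abstract. The helper names and the `count` accumulator are hypothetical.

```python
def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def naive_mul(A, B, count):
    """Schoolbook product; count[0] accumulates scalar multiplications."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                C[i][j] += A[i][k] * B[k][j]
                count[0] += 1
    return C

def strassen(A, B, levels, count):
    """Multiply square matrices using `levels` Strassen recursion levels."""
    n = len(A)
    if levels == 0 or n % 2 != 0:
        return naive_mul(A, B, count)
    h = n // 2

    def quad(M, r, c):
        return [row[c:c + h] for row in M[r:r + h]]

    A11, A12, A21, A22 = quad(A, 0, 0), quad(A, 0, h), quad(A, h, 0), quad(A, h, h)
    B11, B12, B21, B22 = quad(B, 0, 0), quad(B, 0, h), quad(B, h, 0), quad(B, h, h)

    def rec(X, Y):
        return strassen(X, Y, levels - 1, count)

    # Seven block products instead of eight.
    M1 = rec(add(A11, A22), add(B11, B22))
    M2 = rec(add(A21, A22), B11)
    M3 = rec(A11, sub(B12, B22))
    M4 = rec(A22, sub(B21, B11))
    M5 = rec(add(A11, A12), B22)
    M6 = rec(sub(A21, A11), add(B11, B12))
    M7 = rec(sub(A12, A22), add(B21, B22))

    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom
```

For a 4x4 product, the multiplication count drops from 64 with no recursion to 56 with one level and 49 with two levels, at the cost of extra additions and subtractions.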



Keep It Light! Simplifying Image Clustering Via Text-Free Adapters

arXiv.org Machine Learning

Many competitive clustering pipelines have a multi-modal design, leveraging large language models (LLMs) or other text encoders, and text-image pairs, which are often unavailable in real-world downstream applications. Additionally, such frameworks are generally complicated to train and require substantial computational resources, making widespread adoption challenging. In this work, we show that in deep clustering, competitive performance with more complex state-of-the-art methods can be achieved using a text-free and highly simplified training pipeline. In particular, our approach, Simple Clustering via Pre-trained models (SCP), trains only a small cluster head while leveraging pre-trained vision model feature representations and positive data pairs. Experiments on benchmark datasets including CIFAR-10, CIFAR-20, CIFAR-100, STL-10, ImageNet-10, and ImageNet-Dogs demonstrate that SCP achieves highly competitive performance. Furthermore, we provide a theoretical result explaining why, at least under ideal conditions, additional text-based embeddings may not be necessary to achieve strong clustering performance in vision.


HEPPO: Hardware-Efficient Proximal Policy Optimization -- A Universal Pipelined Architecture for Generalized Advantage Estimation

arXiv.org Artificial Intelligence

This paper introduces HEPPO, an FPGA-based accelerator designed to optimize the Generalized Advantage Estimation (GAE) stage in Proximal Policy Optimization (PPO). Unlike previous approaches that focused on trajectory collection and actor-critic updates, HEPPO addresses GAE's computational demands with a parallel, pipelined architecture implemented on a single System-on-Chip (SoC). This design allows for the adaptation of various hardware accelerators tailored for different PPO phases. A key innovation is our strategic standardization technique, which combines dynamic reward standardization and block standardization for values, followed by 8-bit uniform quantization. This method stabilizes learning, enhances performance, and manages memory bottlenecks, achieving a 4x reduction in memory usage and a 1.5x increase in cumulative rewards. We propose a solution on a single SoC device with programmable logic and embedded processors, delivering throughput orders of magnitude higher than traditional CPU-GPU systems. Our single-chip solution minimizes communication latency and throughput bottlenecks, significantly boosting PPO training efficiency. Experimental results show a 30% increase in PPO speed and a substantial reduction in memory access time, underscoring HEPPO's potential for broad applicability in hardware-efficient reinforcement learning algorithms.
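For readers unfamiliar with the stage being accelerated, here is a minimal software sketch of the standard GAE recursion, followed by a plain 8-bit uniform quantizer. It does not reproduce HEPPO's pipelined hardware or its dynamic reward and block standardization scheme, whose details the abstract does not give. The function names, default `gamma` and `lam` values, and the fixed `[lo, hi]` quantization range are assumptions for illustration.

```python
def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one trajectory.

    `values` must hold len(rewards) + 1 entries; the last one is the
    bootstrap value of the state after the final step.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Backward pass: A_t = delta_t + gamma * lam * A_{t+1}
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages

def quantize_uint8(xs, lo, hi):
    """Uniformly map values in [lo, hi] to integer codes 0..255 (clipped)."""
    scale = (hi - lo) / 255
    return [min(255, max(0, round((x - lo) / scale))) for x in xs]

def dequantize_uint8(codes, lo, hi):
    """Inverse mapping from 8-bit codes back to approximate values."""
    scale = (hi - lo) / 255
    return [lo + q * scale for q in codes]

# Usage: compute advantages, then store them as 8-bit codes.
adv = gae([1.0, 0.0, 2.0], [0.5, 0.4, 0.3, 0.0])
codes = quantize_uint8(adv, lo=-4.0, hi=4.0)
approx = dequantize_uint8(codes, lo=-4.0, hi=4.0)
```

Storing 8-bit codes instead of 32-bit floats uses a quarter of the memory, which would match the reported 4x reduction if the baseline is 32-bit storage; the abstract does not state the baseline format.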


Training Fair Models in Federated Learning without Data Privacy Infringement

arXiv.org Artificial Intelligence

Training fair machine learning models is becoming increasingly important. As many powerful models are trained by collaboration among multiple parties, each holding some sensitive data, it is natural to explore the feasibility of training fair models in federated learning so that the fairness of trained models, the data privacy of clients, and the collaboration between clients can be fully respected simultaneously. However, the task of training fair models in federated learning is challenging, since it is far from trivial to estimate the fairness of a model without knowing the private data of the participating parties, access to which is often restricted by privacy requirements in federated learning. In this paper, we first propose a federated estimation method to accurately estimate the fairness of a model without infringing the data privacy of any party. Then, we use the fairness estimation to formulate a novel problem of training fair models in federated learning. We develop FedFair, a well-designed federated learning framework, which can successfully train a fair model with high performance without data privacy infringement. Our extensive experiments on three real-world data sets demonstrate the excellent fair model training performance of our method.
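The abstract does not specify the fairness metric or the estimation protocol, so the sketch below is only an assumed illustration of the general idea: each client shares per-group summary counts rather than raw records, and the server combines them into a global fairness estimate. The metric used here (demographic parity gap) and all function names are hypothetical choices, not FedFair's method, and sharing counts alone is not a formal privacy guarantee.

```python
def local_stats(predictions, groups):
    """Client side: summarize local data as {group: (positives, total)}.

    Only these counts leave the client, never individual records.
    """
    stats = {}
    for yhat, g in zip(predictions, groups):
        pos, n = stats.get(g, (0, 0))
        stats[g] = (pos + int(yhat == 1), n + 1)
    return stats

def federated_parity_gap(client_stats, group_a, group_b):
    """Server side: demographic parity gap from aggregated client counts.

    Assumes each group appears on at least one client; otherwise the
    positive rate is undefined and this raises ZeroDivisionError.
    """
    totals = {group_a: [0, 0], group_b: [0, 0]}
    for stats in client_stats:
        for g in (group_a, group_b):
            pos, n = stats.get(g, (0, 0))
            totals[g][0] += pos
            totals[g][1] += n
    rate_a = totals[group_a][0] / totals[group_a][1]
    rate_b = totals[group_b][0] / totals[group_b][1]
    return abs(rate_a - rate_b)

# Two hypothetical clients with binary predictions and group labels.
client_1 = local_stats([1, 0, 1, 1], ["a", "a", "b", "b"])
client_2 = local_stats([0, 0, 1], ["a", "b", "b"])
gap = federated_parity_gap([client_1, client_2], "a", "b")
```

Because the counts are additive, the aggregated estimate equals the value that would be computed on the pooled data, which is what makes this kind of summary-based estimation exact for count-based metrics.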


Manikin-Recorded Cardiopulmonary Sounds Dataset Using Digital Stethoscope

arXiv.org Artificial Intelligence

Heart and lung sounds are crucial for healthcare monitoring. Recent improvements in stethoscope technology have made it possible to capture patient sounds with enhanced precision. In this dataset, we used a digital stethoscope to capture both heart and lung sounds, including individual and mixed recordings. To our knowledge, this is the first dataset to offer both separate and mixed cardiorespiratory sounds. The recordings were collected from a clinical manikin, a patient simulator designed to replicate human physiological conditions, generating clean heart and lung sounds at different body locations. This dataset includes both normal sounds and various abnormalities (i.e., murmur, atrial fibrillation, tachycardia, atrioventricular block, third and fourth heart sound, wheezing, crackles, rhonchi, pleural rub, and gurgling sounds). The dataset includes audio recordings of chest examinations performed at different anatomical locations, as determined by specialist nurses. Each recording has been enhanced using frequency filters to highlight specific sound types. This dataset is useful for applications in artificial intelligence, such as automated cardiopulmonary disease detection, sound classification, unsupervised separation techniques, and deep learning algorithms related to audio signal processing.